HOW BODY SIZE BECAME
A DISEASE 


A history of the body mass index and its 
rise to clinical importance 


Katherine M. Flegal 


Introduction 


Taller people tend to weigh more than shorter people. But how does weight vary with 
height? The body mass index (BMI, calculated as weight (W) divided by height (H) squared) 
is one way to express weight adjusted for height. It is widely used, with the advantages of 
being technically uncomplicated, non-invasive, easy to measure and calculate. Where did it 
come from? How did it become the definition of a disease? 


Quetelet and the relation of weight to height 


The 19th-century Belgian statistician Adolphe Quetelet (1796–1874), originally trained as
an astronomer, had wide-ranging interests in social and physical characteristics of human 
populations, addressing topics such as natality, mortality, crime, education and more. Stigler 
(1986) provides an extensive account of Quetelet’s contributions; see also Eknoyan (2008) 
and Weigley (2000). A pioneer in statistical data gathering and statistical thinking, Quetelet 
wanted to discover the mathematical laws governing social as well as physical phenomena. A 
minor aspect of his work was the investigation of development of body measurements from 
birth through adulthood, and among these was the relation of weight to height in adults. In 
a brief footnote in his 1835 book “Sur l’homme,” Quetelet observed that, for adults, weight 
varied as the square of height (Quetelet, 1835, Vol. 2, p. 54). 

He derived this from empirical observations, not from theoretical considerations. Quet- 
elet recognized that “if man increased equally in all his dimensions, his weight at different 
ages would be as the cube of his height. Now this is not what we really observe” (Quetelet, 
1835, Vol. 2, p. 52). He took the 12 tallest and the 12 shortest men within a data set and found
their average heights and also their average weight/height values. He observed that the ratio
of the mean height of the shortest men to that of the tallest men was 5 to 6, and that the
ratio of their mean weight/height values was also 5 to 6. As long as these two proportions are
the same as each other, then, as Quetelet’s footnote points out, it can be shown that weight
increases as the square of stature.
He repeated these observations for women, with the same finding. Quetelet was interested 
in the properties of what he called “l’homme moyen” (the average or typical individual)
and did not attempt to investigate departures of weight from what was expected based on
stature. Quetelet did not propose an index. However, because of his observations, the ratio
of weight divided by height squared (W/H²) later became known as Quetelet’s index.
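
The step from Quetelet’s two ratios to the square law can be written out as a short worked equation. This is only a sketch: the subscripts s and t (shortest and tallest groups) are notation introduced here, not Quetelet’s own, and the group means are treated as if they were single representative values.

```latex
% H = mean height, W = mean weight; s = shortest group, t = tallest group (assumed notation).
\[
\frac{H_s}{H_t} = \frac{5}{6},
\qquad
\frac{W_s/H_s}{W_t/H_t} = \frac{5}{6}
\]
\[
\frac{W_s}{W_t}
  = \frac{W_s/H_s}{W_t/H_t}\cdot\frac{H_s}{H_t}
  = \left(\frac{H_s}{H_t}\right)^{2}
  = \frac{25}{36}
\]
% If weight had varied as the cube of height, the ratio would instead be (5/6)^3 = 125/216.
```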

Other early empirical investigations agreed with Quetelet’s findings that weight varied as 
the square of height. In 1869, Benjamin Gould, the actuary for the US military, published 
extensive data on newly demobilized Civil War soldiers (Gould, 1869). Beyond investigating
topics ranging from age at enlistment and place of birth to hair and eye colours, Gould also
addressed anthropometry (body measurements), covering not only weight and height but also
many other physical dimensions. As had Quetelet before
him, Gould observed empirically that weight did not vary as the cube of height but rather 
as the square. According to Gould, “... we are irresistibly led to the singular and interesting 
discovery that the mean weights, vary strictly as the squares of the statures. ... The fact here 
elicited was observed by Quetelet” (Gould, 1869, p. 409). 


Quetelet’s index - just one of many 


Quetelet’s index was just one of a number of different indicators used for descriptive and 
research purposes in the 19th and early 20th centuries to express weight adjusted for height. 
Such indicators were simply ways to standardize weight for height, so that descriptions and 
comparisons of weight could be made across individuals of different heights. As noted by 
Gray and Mayall (1920) and by Billewicz, Kemsley, and Thomson (1962), the nomenclature 
attached to these indicators is not always consistent, and it would require a good deal of his- 
torical research to establish their origins. 

The Broca index, weight (kg) = height (cm) − 100, is said to have been developed around
1871 (Laurent et al., 2020; Rossner, 2007) by the French neuroscientist Paul Pierre Broca
(1824–1880), although its origins are obscure (Gray & Mayall, 1920). In 1898, the Italian
doctor Ridolfo Livi (1856–1920) argued that because weight is a measure of volume, the correct
index would be the cube root of weight, divided by height, which he called the “indice
ponderal” or ponderal index (Livi, 1898). Swiss physician Fritz Rohrer (1888–1926) developed
the Rohrer index, weight divided by height cubed, which also used the cube rather than
the square (Rohrer, 1908, 1921). In a later version, William H. Sheldon (1898-1977), the 
influential creator of the concept of somatotypes (Vertinsky, 2002, 2007), used an inverted 
form of the index, which he also called ponderal index. 

In a very large sample of US military conscripts, however, Davenport observed, “Were 
short people and tall people of the same shape, then it would be true that their weight would 
be expected to vary with the cube of any one dimension. But this assumption is not true” 
(Davenport, 1920, p. 470). He summarized his findings by saying, “for young adult males 
the best index of build is apparently obtained by dividing weight by the square of stature” 
(Davenport, 1920, p. 475). 

In these discussions, weight for height indicators were viewed as a method of standardizing
weight for height for descriptive purposes. For instance, Livi (1898) felt that his index could 
be used to examine how weight varied by factors such as sex, age, race or environmental 
conditions after adjusting for differences in height. 
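
The indices described in this section can be placed side by side in a short sketch. The function names are hypothetical, and the units (kilograms and metres, with centimetres for the Broca rule) are an assumption made for illustration; the historical authors used a variety of units and scaling constants that are not reproduced here.

```python
def quetelet_index(weight_kg, height_m):
    """Weight divided by height squared (W/H^2), later called BMI."""
    return weight_kg / height_m ** 2


def broca_weight_kg(height_cm):
    """Broca rule: reference weight (kg) = height (cm) - 100."""
    return height_cm - 100.0


def livi_ponderal_index(weight_kg, height_m):
    """Livi: cube root of weight divided by height."""
    return weight_kg ** (1.0 / 3.0) / height_m


def rohrer_index(weight_kg, height_m):
    """Rohrer: weight divided by height cubed."""
    return weight_kg / height_m ** 3


def sheldon_ponderal_index(weight_kg, height_m):
    """Sheldon's inverted form: height divided by cube root of weight."""
    return height_m / weight_kg ** (1.0 / 3.0)


if __name__ == "__main__":
    w, h = 70.0, 1.75  # an arbitrary illustrative adult, not data from the text
    print(round(quetelet_index(w, h), 2))
    print(broca_weight_kg(h * 100))
    print(round(rohrer_index(w, h), 2))
```

Under these unit assumptions, Livi’s and Sheldon’s indices are reciprocals of each other, which is the sense in which Sheldon’s form is “inverted”.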


Life insurance and the beginnings of “ideal weight” standards 


On a different track, weight for height tables began to be developed and then to be used 
as an indicator of risk for life insurance purposes. Czerniawski (2007) provides a detailed 
analysis of the development of such tables between the 1830s (when Quetelet published what
is perhaps the first example of a weight for height table) and 1943, when the Metropolitan
Life Insurance Company presented tables of “ideal weights” by height for men and women.
Initially descriptive, such tables were transformed into a tool used for actuarial purposes
by life insurance companies and then into recommendations for the general population, moving
from “average” to “ideal” weights. Shephard (1907) presented a table compiled in 1897 of
the average heights and weights by age of men who had been accepted for life insurance, 
stating that weights 20 percent above or below those average weights indicated a poor actuarial
risk. Symonds (1908), discussing the same table, asserted that someone who
was 20 percent or more above the standard weight for age should be considered overweight.
Since the 1897 standard weights increased with age, the weight considered overweight on 
this basis also increased with age. Later, Armstrong, Dublin, Wheatley, and Marks (1951), 
discussing the 1943 Metropolitan Life tables, recommended a fixed set of standards based on 
the ideal weight for ages 25–30, with 10 percent over that weight considered overweight and
20 percent over ideal weight being pathological overweight or obesity. 

Throughout the 1960s and 1970s, the 1959 Metropolitan Life tables (Metropolitan Life 
Insurance Company (MLIC), 1959) played a dominant role in identifying what were called 
“desirable” weights (Weigley, 1984). Seltzer (1965) commented on the severe and often
unrealistic requirements of the desirable weights in these tables. A discussion and critique of
some of the shortcomings of the 1959 version of the Metropolitan Life tables and the later 
1983 version (MLIC, 1983) was presented at the 1985 NIH Conference on the Health Im- 
plications of Obesity (Harrison, 1985). Knapp critiqued both the concept of “ideal weight” 
and the construction of the Metropolitan Life tables, calling the methods by which the 
tables were constructed “almost unfathomable” (Knapp, 1983, p. 507). Jarrett (1986) quoted 
Knapp but added that he questioned the “almost.” Andres, Elahi, Tobin, Muller, and Brant 
(1985) reanalyzed the data on which the 1983 Metropolitan Life tables were based and 
concluded that weight standards should be adjusted for age and should be higher for older 
adults, a finding that was the subject of some controversy (Willett, Stampfer, Manson, & 
VanItallie, 1991).


Weight-height indices and fatness 


Early attempts to use weight and height to create an indicator of adiposity (body fatness) 
were limited by the methods of measuring adiposity then available before the advent of 
methods such as dual-energy x-ray absorptiometry (DXA) and magnetic resonance imaging. 
One method involved weighing a person underwater and then following the principles set 
out by Archimedes to determine body density (Forbes, 1999). Although this method, called 
hydrodensitometry or hydrostatic weighing, provided accurate estimates of body density, 
converting body density to a measure of fatness required assumptions about body composition
that might not be accurate at the individual level. Another method was to use callipers
to measure the thickness of subcutaneous fat at different body sites (skinfold measurements).
With these, one approach was to characterize adiposity simply by the sum of the
skinfold measurements; another approach was to use prediction equations (e.g., Wilmore &
Behnke, 1969) to generate estimates of body fat from skinfold measurements. Because of the
difficulties in obtaining measures of adiposity, much of the discussion in the 1960s and 1970s 
about weight–height indices was based on theoretical considerations or on comparisons of
different indicators against population distributions of weight for height, not on comparisons
with measures of adiposity. These discussions typically evaluated different values for the
power of height in power-type indices of the general form W/H^p, generally including W/H,
W/H² and some index using a power of 3, such as H/W^(1/3).

An early attempt to address the issue of indices of adiposity (Billewicz et al., 1962) had ad- 
iposity data from densitometry only for 81 men and women from Taiwan and 98 American 
infantrymen in training, not enough for a detailed analysis. Therefore, Billewicz et al. (1962) 
primarily compared various functions of weight and height to the distribution of weight for 
height in several other population samples, making the assumption that, in a normal unse- 
lected population, the distribution of body weight at each level of height would reflect, in a 
general way, the distribution of adiposity. This is really the same problem of describing how 
weight varies with height that Quetelet had already addressed without making any assumptions
about adiposity. Billewicz et al. (1962) concluded that, of the weight–height indices,
Quetelet’s index was the best approximation to adiposity but that a better approximation to 
adiposity would be to use the ratio of weight to standard weight from a life insurance table 
of standard weights for height. 

As had Billewicz et al., Khosla and Lowe (1967) also evaluated weight-height indices 
in the absence of any measures of adiposity, simply by comparing them to the distribu- 
tions of weight for height and inferring their relationship to adiposity from this compari- 
son. They felt that Quetelet’s index was the best and used it to compare groups within an 
industrial population that differed in height. This comparison demonstrated that the senior 
staff weighed more than the wage-earners because they were taller and thus not necessarily 
because they were more obese. 

Other discussions similarly evaluated weight–height indices as indicators of adiposity despite
not using any actual measures of adiposity. Evans and Prior (1969) extended Khosla and
Lowe’s approach to Rarotongans and Pukapukans, two Polynesian populations in the Cook 
Islands. They concluded that weight/height² was satisfactory as a method of adjusting weight
for height within genetically homogeneous groups but that it should be interpreted with cau- 
tion to compare different racial groups. Similar conclusions were reached by Lee, Kolonel, 
and Hinds (1981) who compared weight for height among five different ethnic groups in 
Hawai'i — Caucasians, Chinese, Filipinos, native Hawaiians, and Japanese — and felt that the 
relation of BMI to height differed across groups. They therefore preferred the approach suggested
by Benn (1971) of calculating a population-specific power of height, chosen to minimize
the correlation of the resulting index with height.
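
One way to obtain such a population-specific power can be sketched as follows. This is an illustrative assumption, not necessarily Benn’s exact procedure: the power p is taken as the least-squares slope of log weight on log height, which makes log(W/H^p) uncorrelated with log height in the sample. The function names are hypothetical.

```python
import math


def estimate_height_power(heights_m, weights_kg):
    """Least-squares slope of log(weight) on log(height).

    Assumption for illustration: this slope is used as the power p
    in an index of the form W / H**p.
    """
    xs = [math.log(h) for h in heights_m]
    ys = [math.log(w) for w in weights_kg]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def height_adjusted_index(weight_kg, height_m, power):
    """Power-type index W / H**power for one individual."""
    return weight_kg / height_m ** power


if __name__ == "__main__":
    # Synthetic heights and weights built to follow an exact square law.
    heights = [1.50, 1.60, 1.70, 1.80, 1.90]
    weights = [22.0 * h ** 2 for h in heights]
    print(round(estimate_height_power(heights, weights), 6))
```

With real survey data the estimated power would generally differ between populations, which is the point Lee, Kolonel, and Hinds were making.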

Florey compared three different power-type indices but firmly rejected the idea that an 
index of weight and height could serve as a measure of adiposity: “One must conclude that 
the indices are at most weight corrected for height, and should not be used as indices of 
adiposity or physique in the belief that they are valid measures of these qualities” (Florey, 
1970, p. 102). Florey felt that the appropriate nomenclature would be an “index of corrected 
weight” to remove any idea that these were indices of fatness and concluded that all three 
indices were poor measures of adiposity. 


Quetelet’s index gets a new name and a lukewarm recommendation 


In 1972, American physiologist Ancel Keys studied four indicators of weight for height (three 
weight-height indices and the percent of standard weight-for-height) and their correlation 
with body fat, as assessed by skinfold measurements in a healthy all-male sample from five 
countries, including Japanese farmers and fishermen, Bantu workers in South Africa, univer- 
sity students and executives from Minnesota, railroad workers in Italy and the US, and sam- 
ples of rural populations in Finland and Italy (Keys, Fidanza, Karvonen, Kimura, & Taylor, 
1972). Of these indicators, he felt that the best was W/H², which he renamed the “body mass
index” (BMI) and suggested as an approximate indicator of fatness for research purposes. But
Keys’ recommendation was lukewarm: “the body mass index, W/H², proves to be, if not
fully satisfactory, at least as good as any other relative weight index as an indicator of relative
obesity” (Keys et al., 1972, p. 339). Keys rejected the idea of using BMI to label people as overweight
and characterized such value judgments as “scientifically indefensible” (1972, p. 341).


BMI categories begin to be used as labels


Bray (1978) recommended a cut point for overweight at a BMI of 25 and for obesity at a BMI 
of 30. These values were loosely based on an adaptation of the 1959 Metropolitan Life tables 
from the “Obesity in Perspective” conference held in 1973 at the Fogarty Center (Bray, 
1975). The value of 25 corresponded approximately to the top of the acceptable or
recommended range of weights for height, and the value of 30 to 120 percent of the top of that
range. Garrow (1981) suggested that obesity be classified as follows: Grade 0, BMI 20–24.9;
Grade I, BMI 25–29.9; Grade II, BMI 30–40; and Grade III, BMI > 40. He went on to say
that “the choice of 25, 30 and 40 as boundaries between grades 0, I, II and III is arbitrary 
and can be justified only on grounds of convenience” (Garrow, 1981, p. 3). These arbitrary 
cut points at 5-unit or 10-unit intervals ending in 0 or 5 are more likely to represent a form 
of digit preference than to correspond to any specific biological reality. These recommenda- 
tions were intermittently used but not considered definitive. 
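
Garrow’s grading can be expressed as a small lookup, sketched here under stated assumptions. The function name is hypothetical. Values between 24.9 and 25, or 29.9 and 30, are assigned to the lower grade, a BMI of exactly 40 is assigned to Grade II, and values below 20 are left ungraded; none of these boundary choices is specified in the text.

```python
def garrow_grade(bmi):
    """Return Garrow's (1981) grade as an integer 0-3, or None below 20.

    Boundary handling is an assumption made for this sketch.
    """
    if bmi < 20:
        return None  # below the Grade 0 range quoted in the text
    if bmi < 25:
        return 0  # Grade 0: BMI 20-24.9
    if bmi < 30:
        return 1  # Grade I: BMI 25-29.9
    if bmi <= 40:
        return 2  # Grade II: BMI 30-40
    return 3  # Grade III: BMI > 40


if __name__ == "__main__":
    for value in (18.0, 22.0, 27.0, 30.0, 40.0, 41.0):
        print(value, garrow_grade(value))
```

The same round-number arithmetic links Bray’s two cut points: 120 percent of 25 is 30.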

BMI continued to be just one among numerous possible indicators of weight for height. 
The Obesity in America conference held in 1977 at the Fogarty Center recommended the use 
of an adaptation of the 1959 Metropolitan Life insurance tables to evaluate relative weight 
for clinical purposes and suggested that investigators should in addition consider using the 
BMI (Bray, 1979). A 1982 workshop on body weight, health, and longevity sponsored by 
the Nutrition Coordinating Committee of the National Institutes of Health (NIH) and the 
Centers for Disease Control (CDC) recommended that, in order to establish age-related 
desirable weights, data on weight and height should be analysed and presented separately by 
sex, age, and duration of follow-up, with age divided by decades (Simopoulos & Van Itallie, 
1984). The committee also recommended that data on weight and height be additionally 
expressed as the BMI with a median, range, and standard deviation presented for each age 
and gender group. 

Explicit definitions of overweight in terms of BMI were created in 1985 and began to 
be used for US government publications and research studies. In 1985, as part of its series 
of consensus conferences on a wide variety of topics, NIH convened a short conference on 
the health effects of obesity (“Health Implications”, 1985). As part of this effort, the panel investigated
several methods to express weight adjusted for height, such as weight–height tables,
and suggested that physicians consider the use of BMI as an additional factor in evaluating 
patients. The panel described BMI as a simple measurement that minimized the effect of 
height and would be useful for descriptive or evaluative purposes. This recommendation, 
along with Keys’ observations, contributed to the gradual adoption of BMI over the next 
decade as a standard international measure of weight for height for adults. A practical factor 
at the time that limited the use of BMI was the difficulty in calculating such an index before 
the advent of pocket calculators; thus, the panel recommended that nomograms be used to 
facilitate calculations of body mass index. 

The 1985 conference recommended a definition of overweight as a BMI equal to or 
greater than 27.8 for men or 27.3 for women. To arrive at this definition, the panel combined 
information from life insurance tables with national survey data on measured weight and
height. These BMI cutoffs represented the sex-specific 85th percentile of the BMI distribution
for persons aged 20–29 years in the second National Health and Nutrition Examination
Survey (NHANES II). The rationale for selecting this age group as the reference population
was that young adults are relatively lean, and the increase in body weight that usually occurs 
with age is due almost entirely to fat accumulation. The panel also noted that these values 
corresponded approximately to a weight 20 percent above the midpoint of the sex-specific 
median weight range across all heights for a medium frame in the 1983 Metropolitan Life 
weight-for-height tables (MLIC, 1983). These definitions were adopted within the US gov- 
ernment and widely used within the US, but not elsewhere. 

There was still no real consensus over what categories to use (Kuczmarski & Flegal, 
2000). The overweight criteria based on the BMI cutoffs of ≥27.8 for men and ≥27.3 for
women were used to report the prevalence of overweight among US adults in every an- 
nual edition of the government publication “Health United States” beginning in 1985 and 
continuing through 1998. The quinquennial US Dietary Guidelines, beginning in 1980, 
conformed pretty closely to a BMI of 25 as the dividing line to define overweight (Flegal, 
Troiano, & Ballard-Barbash, 2001). Yet, different BMI levels were suggested in the report 
“Diet and Health: Implications for Reducing Chronic Disease Risk” (National Research 
Council, 1989), which suggested a “desirable BMI” of 20–25 for those 25–34 years old,
increasing with age to a desirable BMI of 24–29 for those over 65.


Weight loss treatments become a medical issue 


Weight loss was not always seen as a salient medical issue (Maddox, Anderson, & Bogdonoff, 
1966; Maddox & Liederman, 1969; Puhl & Brownell, 2001). For tax purposes, the IRS 
did not allow costs of weight loss as a medical deduction until 2002. Health insurance did 
not cover weight loss treatments (Gibbs, 1995). Research showed that, in more than half of 
doctor visits, weight and height were not even measured (Graham, 2012). A timeline review 
(Kyle, Dhurandhar, & Allison, 2016) shows the changes that took place over time to further 
the concept of obesity first as a medical issue and then as a disease. 

Weight-loss drugs had a checkered history (Colman, 2005). It was difficult to define 
clinical efficacy, and there were concerns about possible side effects. Fenfluramine had been 
approved in 1973, but only for short-term use. Interest in weight-loss drugs revived
in the 1990s with the advent of several new drugs. The drug combination fenfluramine/phentermine,
known as fen-phen, was an anti-obesity treatment that utilized two
anorectics (Weintraub, 1992a, 1992b). In addition to popularizing off-label use of these two 
drugs, the fen-phen studies began a transition from short-term to long-term drug treatment 
of obesity. The first drug to garner Food and Drug Administration (FDA) approval in the 
US for the long-term treatment of obesity was dexfenfluramine, an isomer of fenfluramine. 
In 1996, dexfenfluramine (marketed as Redux) was approved for longer term use. The use 
of these medicines exploded, with approximately 14 million prescriptions written for fenflu- 
ramine or dexfenfluramine from 1995 until they were withdrawn for safety reasons in 1997 
(Yanovski, 2005). Although these medicines were intended for weight loss, 25 percent or 
more of users were not overweight (Blanck, Khan, & Serdula, 2004; Khan, Serdula, Bow- 
man, & Williamson, 2001). Other medications appeared. Sibutramine, a norepinephrine 
and serotonin reuptake inhibitor, marketed in the US as Meridia, was approved by the FDA 
in 1997 but withdrawn in 2010 for safety reasons. Orlistat, marketed by Roche as Xenical, 
was approved by the FDA in 1999. Xenical was described as a potential blockbuster by an 
industry analyst who predicted that the market could be between $5 billion and $10 billion
(Sharpe, 1999). Sales, however, proved disappointing. 

The limited medical concern for obesity was seen as a barrier to wider acceptance of
the use of weight-loss medications. According to a Reuters report (Bruton, 2000) entitled
“Quest for blockbuster obesity drug vexes firms,” companies believed that, for drugs like
Xenical and Meridia to reach their potential, the possibility of a pharmaceutical treatment 
for obesity had to be more widely accepted. Many consumers still saw obesity more as a 
cosmetic concern than as a health issue. The Reuters report quoted Terence Hurley, a Roche 
spokesman, as saying, “Part of our challenge moving forward with Xenical is to ‘medicalize’
weight management to physicians.”


WHO and NIH use BMI to define “obesity” 


An important step towards international standardization occurred in 1993. A World Health 
Organization (WHO) expert committee met in Geneva for a week and produced a lengthy 
report on the uses of anthropometry (body measurements) to assess health and nutritional 
conditions including malnutrition, stunting, thinness, and overweight for all segments of 
the population — from newborn infants to older adults (“Physical Status”, 1995). For adults, 
the panel used BMI to define three grades of overweight. They selected the tidy cut points 
of 25, 30, and 40, describing the method used to establish cut-off points as largely arbitrary 
and noting that Grade 2 overweight (BMI 30–40) was relatively common in industrialized
countries. The panel noted that, “in essence, it has been based upon visual inspection of the 
relationship between BMI and mortality; the cut-off of 30 is based on the point of flexion 
of the curve” (“Physical Status”, 1995, p. 313). No reference was provided for this statement. 
The expert committee defined obesity as the degree of fat storage associated with clearly 
elevated health risks but noted the lack of scientific consensus on exactly what level of fat 
this might be. They stated explicitly, “there are no clearly established cut-off points for fat 
mass or fat percentage that can be translated into cut-offs for BMI” (“Physical Status”, 1995, 
p. 312). The panel did not offer any definition of obesity in terms of BMI. 

In 1995, the International Obesity Task Force (IOTF, now called World Obesity Policy & 
Prevention) was formed by Philip James, the then director of the Rowett Research Insti- 
tute (World Obesity Federation, n.d.). The object of this self-appointed task force, funded 
primarily by the pharmaceutical industry (Moynihan, 2006a), was to persuade the WHO 
to convene a special consultation in Geneva that would be solely devoted to obesity (W. P. 
James, 2008). The IOTF’s mission was to inform the world about the urgency of the prob- 
lem and to persuade governments that now was the time to act. WHO was initially reluc- 
tant but agreed to hold the consultation on the condition that it be delayed for six months. 
Eventually, WHO convened a three-day expert consultation on obesity in 1997, resulting 
in a lengthy report published in 2000 (WHO, 2000). The consultation received a substantial 
grant from the IOTF (W. P. T. James, 2002), and the IOTF itself prepared the draft report 
for the consultation, which was accepted with only minor modifications. The final WHO 
report largely followed the IOTF document. 

The publication of the official report of the WHO consultation was delayed (W. P. James, 
2008). The proposal had not been part of the WHO usual planning process nor agreed 
to by the WHO Executive Board. However, subsequent discussions with the then-WHO 
Executive Director resulted in WHO deciding to publish the report as part of the WHO 
Technical Report Series, “Obesity: Preventing and Managing the Global Epidemic” (2000). 
Due to backlogs in report preparation, the Technical Report Series publication was delayed. 
However, WHO agreed to issue an interim document in 1998 (“Obesity: Preventing and
Managing the Global Epidemic”, 1998) that differed slightly from the final version. The 
IOTF sent free copies of the interim document directly to every Minister of Health in the 
192 WHO member countries and to other interested persons and organizations. In the in- 
terim version of the report, WHO expressed deep appreciation for both the financial and 
technical contributions of the IOTF in convening the consultation. 

The 1997 WHO consultation made a key terminological change from the 1995 WHO 
report. The section on BMI included a table on the classification and described it as being 
“in agreement” with the 1995 WHO report. In fact, although the BMI cut points were the 
same (with an additional cut point at 35.0), the terminology was not. The 1997 consultation 
decided to use the term “obesity” instead of “overweight” for BMI values of 30 or above, 
with little explanation or justification for the change. BMI, heretofore a simple indicator of 
weight adjusted for height, was transformed into the definition of excess fat. 

In 1995, NIH had convened an expert panel to develop clinical practice guidelines for 
primary care practitioners, reviewing relevant scientific literature through 1997 (“Clinical 
Guidelines”, 1998). The NIH panel overlapped with the IOTF to some degree, with the 
chair of the panel and several members also being members of the IOTF (International Obe- 
sity Task Force, 2000). The NIH panel adopted the same cut points for BMI and almost the 
same terminology as the 1997 WHO definition, citing the then as yet unpublished WHO 
report as the source for these definitions. 

Suddenly, the US government was defining “obesity” as a BMI of 30 or above. Over- 
night, millions of Americans were classified into an ominous new category. The new cut 
points were described in the New York Times as providing the pharmaceutical industry with 
“a booming new market for diet pills for the obese, practically served to the companies on 
a silver platter by the government” (Stolberg, 1999). The new definitions also expanded the 
definition of overweight to include BMIs of 25 and 26. Some critics, including the former 
surgeon general C. Everett Koop, urged the panel not to broaden the definition of over- 
weight, saying “it will confuse the public and the medical community. It needlessly stigma- 
tizes millions of Americans and lacks a solid scientific rationale” (Squires, 1998). 

Overweight is one of the conditions discussed by Schwartz and Woloshin (1999), where 
expanding disease definitions have increased prevalences of relatively common conditions. 
A later example of expanding disease definitions had to do with BMI categories for children 
and adolescents. The BMI range called “overweight” for children was renamed as “obese” in 
2007 with little rationale (Moynihan, 2006b; Ogden & Flegal, 2010). 


Reimbursement for treating obesity 


In the 1990s and earlier, insurance coverage for treatment of obesity was limited (Gibbs, 
1995), in part because of language in the US Medicare Coverage Issues Manual that stated 
bluntly: “Obesity itself cannot be considered an illness... Program payment may not be 
made for treatment of obesity alone since this treatment is not reasonable and necessary for 
the diagnosis or treatment of an illness or injury” (National Coverage Determination, n.d.). 
In 2001, an IOTF member who had joined the Centers for Disease Control and Prevention 
(CDC) organized and chaired a meeting at CDC entitled “Including Obesity Treatment in 
Benefit Plans” on the topic of reimbursement of health care providers for obesity treatment. 
Following the recommendations from this workshop, CDC put in a formal request to the 
Centers for Medicare and Medicaid Services (CMS) that this language be removed from the 
manual because it was a barrier to insurance coverage for treatment of obesity. The language 
was removed in 2004 (“National Coverage Determination”, n.d.). Versions 1 through 4 of the
coverage determination documents are accessible through CMS (“National Coverage De- 
termination”, n.d.). 

This critical move opened the door for providers to get reimbursed for obesity treatments 
and for health insurance to cover anti-obesity medications. Stern, Kazaks, and Downey
(2005) described acceptance of obesity as a chronic disease, and acceptance of its treatment
by health maintenance organizations, private insurers, and the government, as a major reimbursement
challenge. Baum et al. (2015) felt that although financial incentives and attitudes
towards obesity management were changing, continuing limitations to reimbursement in- 
cluded perceptions of modest efficacy by patients and physicians alike, historical safety issues 
and regulatory obstacles. Oliver opined that “the disease characterization has less to do with 
the health consequences of excess weight and more with the various financial and political 
incentives of the weight loss industry, medical profession, and public health bureaucracy” 
(Oliver, 2006, p. 611). 


When did obesity become a disease? 


Obesity was sometimes but not always seen as a disease. Bray in 1978 asserted that “obesity 
is a symptom of disease, like hypertension, anemia or fever, not a disease itself” (Bray, 1978, 
p. 102). References to obesity as a disease began to creep into the literature. The summary 
of the 1985 NIH conference alluded indirectly to obesity as a disease (“Health Implications”, 
1985). Just over a decade later, the 1998 NIH guidelines stated forthrightly, “Obesity is a 
complex multifactorial chronic disease” (“Clinical Guidelines”, p. xi). The Obesity Society 
(TOS) in the US (Allison et al., 2008) and the World Obesity Federation (Bray, Kim, & 
Wilding, 2017) also endorsed the view that obesity should be considered a disease. In 2013, 
the American Medical Association (AMA) recognized obesity as a chronic disease (AMA, 
2013a, p. 461), although the AMA’s own Council on Science and Public Health had recom- 
mended against adopting the resolution (AMA, 2013b, pp. 335-343). European guidelines 
also endorsed the view of obesity as a disease (Yumuk et al., 2015; Frühbeck et al., 2019), not 
without some discussion (Müller & Geisler, 2017; Vallgårda et al., 2017). 


Obesity becomes a disease - but what is obesity? 


Discussions of obesity as a disease typically include extensive discussions of the definition of 
a disease but little or no discussion of the definition of obesity. According to the 1985 confer- 
ence report, “Obesity is an excess of body fat frequently resulting in a significant impairment 
of health” (“Health Implications”, 1985, p. 1073) but “because the amount of body fat... is 
a continuous variable within the population, all quantitative definitions of obesity must be 
arbitrary” (“Health Implications”, 1985, p. 1074). WHO states “Overweight and obesity are 
defined as abnormal or excessive fat accumulation that may impair health” (WHO, 2020). 
But how is an excess of body fat defined? And what health impairments are involved? 

Bray (1976) noted the major difficulties in identifying how much body fat should be con- 
sidered obese and suggested that, as a working guideline, men with more than 20 percent 
fat and women with more than 28 percent fat should probably be considered obese. The 
report from the 1977 conference at the Fogarty Center (Bray, 1979) noted that, once a mea- 
surement of body fat was established, there was still the problem of interpreting the results 
because no one knew how much body fat was normal or desirable. The 1995 report from 
the WHO expert committee similarly noted the lack of any consensus (“Physical Status”, 
1995, p. 420). One set of standards was developed by Lohman, Houtkooper, & Going (1997), 
who estimated percent body fat with prediction equations applied to triceps and subscapular 
skinfold measurements in NHANES. They then presented age- and sex-specific percent body 
fat standards based on the population distribution of the values of the estimated percent fat, 
although it is not clear how these standards were determined. In a comprehensive literature 
review of the performance of anthropometric measurements relative to body fat reference 
standards, Sommer et al. (2020) found that the cut-offs for the reference tests ranged from 
> 30 percent to ≥ 43 percent body fat in women and from ≥ 20 percent to > 34.6 percent 
in men using DXA, with most studies using a body fat percentage > 35 percent in women 
and > 25 percent in men as the standard for defining obesity (similar but not identical to the 
Lohman criteria). Sommer et al. note that despite the wide use of these and other cut-offs, 
cut-offs were chosen arbitrarily and not necessarily with any scientific basis. 

Even though it is unclear where these body fat criteria come from or why they should be 
considered to represent obesity, numerous articles, as reviewed by Sommer et al. (2020), have 
compared a BMI of 30 or above to body fat criteria and found that not only do most people 
with BMI of 30 or above meet the body fat criteria for obesity, but many people with BMI 
below 30 also meet these criteria, sometimes leading to the suggestion that the BMI criteria 
for obesity should be lowered. For instance, Blew et al. (2002) suggest that BMI > 25 rather 
than BMI > 30 may be superior for diagnosing obesity in postmenopausal women. 

In fact, it is clear that obesity considered as the degree of fat storage associated with 
elevated health risks does not have an exact definition. Rather, the level of body fat asso- 
ciated with health risks varies considerably. According to a 2011 scientific statement from 
the American Heart Association (Cornier et al., 2011), at a given BMI, there is significant 
variability in individual body fatness and the associated risk for health conditions. This 
variability is related to factors such as age, sex, genetics, and ethnicity, but it is also a result 
of differences in body fat distribution and composition. Similarly, a 2017 statement from 
two professional endocrinology societies states that, scientifically speaking, BMI is a poor 
predictor of health and is inadequate as the sole guide for clinical decision-making. BMI at 
any level is not clinically sufficient as a medical diagnosis of disease for a given individual 
(Mechanick, Hurley, & Garvey, 2017). 

Florey cautioned in 1970 that BMI and other indices should not be used as measures of 
adiposity. In 1972, Keys gave BMI a lukewarm recommendation and pointed out that it was 
“scientifically unacceptable” to use BMI to define overweight. Nonetheless, despite these 
early cautions, BMI came into widespread use as a measure of adiposity and was used to 
define overweight and obesity. Going full circle, BMI then became the target of numerous 
commentaries that criticized it for not measuring body composition and for being a poor 
measure of obesity (e.g., Ahima & Lazar, 2013; Blundell, Dulloo, Salvador, & Frühbeck, 
2014; Burkhauser & Cawley, 2008; Flint & Rimm, 2006; Franzosi, 2006; Gonzalez, Correia, 
& Heymsfield, 2017; Green, 2016; Kahn & Bullard, 2016; Kragelund & Omland, 2005; 
Müller, Braun, Enderle, & Bosy-Westphal, 2016). 

Some attempts to clarify and rationalize definitions of obesity continue to use the BMI 
cut points of 25 and 30 but add additional criteria. These include the Edmonton Obesity 
Staging System developed by Sharma and Kushner (2009), the proposed definition by Gar- 
vey and Mechanick (2020) as Adiposity-based Chronic Disease (ABCD), and the European 
approach described by Hebebrand et al. (2017). All these approaches conserve the use of BMI 
as a measure of adiposity and continue to use the same BMI categories. There is little critical 
discussion of the cut point of 30 as opposed to other possible higher or lower cut points; 
rather, the value of 30 seems to be accepted uncritically, demonstrating the proposition that, 
once a label is established, there is a tendency to accept it rather than examine its accuracy 
(Foroni & Rothbart, 2013). 

There is also little discussion of labelling people as having or not having a “normal weight” 
or “healthy weight” simply on the basis of their BMI, although neither of these terms has a 
clear definition. In many, if not most, countries in North America, Europe, 
and Latin America, well over half the population is above “normal” weight, calling into 
question what normal means in this context. In the context of BMI categories, “normal” is 
associated with younger people, people of white or Asian ethnicity, and wealthier people. 
The NIH report states, “Overweight and obesity are especially evident in some minority 
groups, as well as in those with lower incomes and less education” (“Clinical Guidelines”, 
p. xi). In fact, a common finding is that people in the overweight category of BMI have the 
same or slightly lower mortality as those in the normal weight category (Flegal, Kit, Orpana, 
& Graubard, 2013). The entire concept of a single normal or healthy weight applicable across 
age, sex, and ethnicity groupings and even across the lifespan for a given individual has been 
criticized (Dixon, Egger, Finkelstein, Kral, & Lambert, 2015). 

It would appear that the general population does not necessarily agree with the BMI 
categories promulgated by WHO and NIH. This is sometimes referred to as “misperception,” 
but it might better be called “disagreement.” In Denmark, 70 percent of men and 50 
percent of women who self-reported a BMI of 25 or above did not view themselves as over- 
weight (Matthiessen et al., 2014). In Mauritius, where 50 percent of adults were classified as 
overweight or obese, 45 percent of overweight or obese men by the WHO categories and 
38 percent of women misclassified their status (Caleyachetty, Kengne, Muennig, Rutter, & 
Echouffo-Tcheugui, 2016). Misclassification was higher among those who reported their 
health as good or excellent. In England, almost 30 percent of those with BMI levels at or 
above 25 underestimated their status relative to the WHO categories, with higher propor- 
tions of minority groups underestimating their status (Muttarak, 2018). In a large represen- 
tative Canadian sample (Herman, Hopman, & Rosenberg, 2013), 40 percent of men and 
16 percent of women who reported an overweight or obese BMI classified their weight as 
“about right.” On the other hand, 21 percent of women who reported a BMI in the “healthy 
weight range” classified themselves as overweight. In a qualitative study of older people in 
a relatively affluent rural area of the US, researchers found that participants, all with BMI 
levels of 30 or above, did not accept the designation of obesity as a disease, and many rejected 
the label of “obese” outright (Batsis et al., 2021). Researchers seem baffled by why lay peo- 
ple do not agree with their scientific categorizations, even suggesting in one case that per- 
haps misperception has increased because of improvements in fashion design for overweight 
women (Muttarak, 2018). 


Limitations and issues with current uses of BMI 


BMI is a simple way to adjust weight for height, useful for descriptive purposes and to facil- 
itate comparisons of weight among people of different heights. It slowly became medical- 
ized and ultimately became used as the definition of a disease. This is far from its original 
purpose and far from its original meaning. Because it is easy to measure and calculate, BMI 
creates a “streetlight effect” — a type of observational bias, whereby we look for our lost keys 
where the light is brightest. Attempts at international standardization have resulted in the 
creation and overuse of arbitrary BMI categories that don’t identify the same level of health 
risks across individuals or populations. These categories have become used to arrive at pop- 
ulation estimates that are in effect diagnoses of disease based solely on height and weight. 
People are classified as having a disease without having ever received a diagnosis and indeed 
often without any medical encounter at all; yet, the estimates are treated as though they 
represented the prevalence of a clinically diagnosed disease. BMI is not a good measure of 
fat mass, and fat mass may not be the most relevant quantity in any case: a number of studies 
(e.g., Han et al., 2010; Spahillari et al., 2016; Srikanthan, Horwich, & Tseng, 2016) have 
found that low muscle mass is more of a risk than high fat mass. Bosy-Westphal and Müller 
(2021) suggest that obesity should not even be considered 
as a question of body fat per se but should be addressed in terms of body composition, and 
that the use of both BMI and of body fat percentage should be avoided. They call for a new 
paradigm focused on fat-free mass instead and point out that, at older ages, a higher BMI 
may indicate more adequate fat-free mass. 

Even though the many limitations of BMI are well known, and BMI is often criticized, 
that hasn’t stopped its widespread use for purposes well beyond those for which it was in- 
tended. BMI, a simple measure of weight adjusted for height, has undergone considerable 
elaboration and transformation to now serve as a measure of disease, enmeshed in a complex 
web of medical, social, and commercial interests. The fixation on BMI may distract attention 
from more important aspects of health and disease. 


Notes 


1 Brief explanation of Quetelet’s derivation 

Let $H_s$ be the average height of the shortest men and $H_t$ be the average height of the 
tallest men. Let $W_s$ be the average weight of the shortest men and $W_t$ be the average 
weight of the tallest men. Quetelet observed that $H_s/H_t = 5/6$ and that the ratio of 
$(W_s/H_s)$ to $(W_t/H_t)$ was also 5/6. Therefore, $H_s/H_t = (W_s/H_s)/(W_t/H_t)$. 
Applying elementary algebra to rearrange the terms, multiplying both sides of the equation 
by $H_s$ and dividing both sides by $H_t$ show that $H_s^2/H_t^2 = W_s/W_t$. Thus, as 
Quetelet observed, the ratio of weights at different heights is proportional to the ratio of 
the square of the heights. Now dividing both sides by $H_s^2$ and multiplying both sides by 
$W_t$ show that $W_s/H_s^2 = W_t/H_t^2$. 
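
The algebra in this note can be checked numerically. The sketch below is only an illustration: the function name, the variable names, and the heights and weight passed to it are hypothetical values chosen so that the height ratio is 5/6, not Quetelet's data. It uses exact fractions to confirm that, once the ratio of the weight/height values equals the ratio of the heights, weight divided by height squared is the same in both groups.

```python
from fractions import Fraction


def quetelet_check(h_short, h_tall, w_tall):
    """Follow the algebra of the note with exact rational arithmetic.

    All three inputs are hypothetical illustration values, not Quetelet's data.
    """
    height_ratio = h_short / h_tall
    # Choose the short group's average weight so that the ratio of the
    # weight/height values equals the ratio of the heights, which is the
    # empirical relation the note attributes to Quetelet.
    w_short = height_ratio * (w_tall / h_tall) * h_short
    return {
        "height_ratio": height_ratio,
        "weight_per_height_ratio": (w_short / h_short) / (w_tall / h_tall),
        "weight_ratio": w_short / w_tall,
        "height_ratio_squared": height_ratio ** 2,
        "index_short": w_short / h_short ** 2,
        "index_tall": w_tall / h_tall ** 2,
    }


# Assumed example: heights of 1.5 m and 1.8 m (ratio 5/6), tall-group weight 81 kg.
result = quetelet_check(Fraction(3, 2), Fraction(9, 5), Fraction(81))
for name, value in result.items():
    print(f"{name}: {value}")
```

In this example the weight ratio equals the square of the height ratio, and the two index values agree, which is the equality the note arrives at.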